Abstract

The avoidability of binary patterns by binary cube-free words is investigated
and the exact bound between unavoidable and avoidable patterns is found. All
avoidable patterns are shown to be D0L-avoidable. For avoidable patterns, the
growth rates of the avoiding languages are studied. All such languages, except for
the overlap-free language, are proved to have exponential growth. The exact growth
rates of languages avoiding minimal avoidable patterns are approximated through
computer-assisted upper bounds. Finally, a new example of a pattern-avoiding
language of polynomial growth is given.

1 Introduction 

Factorial languages, i.e., languages closed under taking factors of their words, constitute
a wide and important class. Each factorial language can be defined by a set of forbidden
(avoided) structures: factors, patterns, powers, Abelian powers, etc. In this paper, we
consider languages avoiding sets of patterns.

Pattern avoidance is one of the classical topics in combinatorics of words. Recall 
that patterns are words over the auxiliary alphabet of variables. These variables admit 
arbitrary non-empty words over the main alphabet as values. A word over the main 
alphabet meets the pattern if some factor of this word can be obtained from the pattern 
by assigning values to the variables, and avoids the pattern otherwise. 

The main question concerning the avoidance of any set of forbidden structures is 
whether the language of all avoiding words over the main alphabet is finite or infinite. 
The set is called unavoidable in the first case and avoidable in the second case. We use 
the terms k-(un)avoidable to specify the cardinality of the main alphabet.

If a set of structures is avoidable, then the second question is how big is the avoiding 
language in terms of growth. In general, a simple constraint usually defines either a finite 
language or a language of exponential growth. So, the examples of languages having 
subexponential (e.g., polynomial) growth are quite valuable. 

For languages avoiding patterns, the main question is far from being satisfactorily
answered even for the case of a single pattern. A complete description of the pairs
(alphabet, pattern) such that the pattern is avoidable over the alphabet is known only
for patterns with at most three variables [HI [HI [13], [18], [23] and for the patterns that
are not avoidable over any alphabet [HES]. There are very few papers about avoidable
sets of patterns; we only mention a result by Petrov [15]. The only exception is the set
{xxx, xyxyx}, defining the binary overlap-free language, which is quite well presented in
the literature starting from the seminal paper by Thue [23].

There are some scattered results concerning the second question (cf. [3], [13]). To the
best of our knowledge, the only example of a pair (alphabet, pattern) such that the
language avoiding the pattern over the alphabet grows subexponentially with the length
was found in [3]: a 7-ary pattern avoidable over the quaternary alphabet. All infinite
languages avoiding a binary pattern grow exponentially (combined [TIHTJ). However, the
binary overlap-free language has polynomial growth [16].

In this paper we start a systematic study of both questions formulated above for the 
languages specified by a pair of forbidden patterns. It is quite natural to begin with the 
binary main alphabet and consider the patterns of two variables also. For the first step, it 
is also natural to fix one of the patterns to be xxx, which is the shortest pattern avoidable 
over two letters. This step is in line with other studies of binary cube-free words with 
additional constraints (see, e.g., [2]). In this setting, the aim of this paper is to describe 
the avoidability of binary patterns by the binary cube-free words and the order of growth 
of avoiding languages. This description is given by the following theorem. Recall that 
an avoidable set of structures is called D0L-avoidable if it is avoided by an infinite word
generated by the iteration of a morphism. 

Theorem 1.1 (Main theorem). Let $P \in \{x, y\}^*$ be a binary pattern.

1) The set {xxx, P} of patterns is 2-avoidable if and only if P contains as a factor at least 
one of the words 

xyxyx, xxyxxy, xxyxyy, xxyyxx, xxyyxyx, xyxxyxy, xyxxyyxy, (1) 

considered up to negation and reversal. 

2) All 2-avoidable sets {xxx, P} are 2-DOL-avoidable. 

3) For all 2-avoidable sets {xxx, P}, except for the set {xxx, xyxyx}, the avoiding binary 
language has exponential growth. 

This is an "aggregate" theorem, the proof of which does not follow a single main 
line but uses quite different techniques. So, we present this proof as a sequence of lesser 
theorems. Some of these theorems contain refinements to the main theorem (e.g., lower 
bounds for the growth rates of avoiding languages). 

Statement 3 of Theorem 1.1 leaves little hope to find a subexponentially growing
binary language avoiding a pair of patterns; so, we finish the paper by showing an example 
of such a language avoiding a triplet of binary patterns. 

The text is organized as follows. After necessary preliminaries, in Sect. 3 we prove
statement 1 of Theorem 1.1; our proof immediately implies statement 2. In Sect. 4 we
finish the proof of Theorem 1.1 by exhibiting exponential lower bounds for the cube-free
languages avoiding the pattern xyxyxx and all patterns from (1), except for the pattern
xyxyx. In Sect. 5 we estimate actual growth rates of avoiding languages through the upper
bounds obtained by computer. Finally, in Sect. 6 we give a new example of a language of 
polynomial growth. This language consists of cube-free words avoiding a pair of binary 
patterns. 

2 Preliminaries



We study finite, right infinite, and two-sided infinite sequences over the main alphabet
{0, 1} and call them words, $\omega$-words, and $\mathbb{Z}$-words, respectively. We also consider patterns,
which are words over the alphabet of variables {x, y}. Standard notions of factor, prefix,
and suffix of a word are used. For a word w, we write $|w|$ for its length, $w[i]$ for its ith
letter, and $w[i \ldots j]$ for its factor starting in the ith position and ending in the jth position.
Thus, $w = w[1 \ldots |w|]$. Letters in an $\omega$-word are numbered starting with 1. For a binary
word or pattern w, its negation is the word (resp., pattern) $\bar{w}$ such that $|\bar{w}| = |w|$ and
$\bar{w}[i] \ne w[i]$ for any i. The reversal of w is the word $w[|w|] \cdots w[1]$. A word w has period
p if $w[1 \ldots |w|-p] = w[p+1 \ldots |w|]$. The exponent of a word is the ratio between its length
and its minimal period. A word is $\beta$-free if the exponent of any of its factors is less than
$\beta$. Two words are conjugates if they can be represented as uv and vu, for some words
u and v. If a word uv has an integer exponent greater than 1, then vu has the same
exponent.
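The notions of period and exponent can be made concrete with a few lines of Python. This is only an illustrative sketch; the helper names `min_period` and `exponent` are ours and do not come from the paper.

```python
from fractions import Fraction

def min_period(w):
    """Smallest p >= 1 such that w[1..|w|-p] = w[p+1..|w|] (w non-empty)."""
    n = len(w)
    for p in range(1, n + 1):
        if w[:n - p] == w[p:]:
            return p

def exponent(w):
    """Ratio between the length of w and its minimal period."""
    return Fraction(len(w), min_period(w))

print(exponent("01010"))   # an overlap: length 5, minimal period 2
print(exponent("001001"))  # a square
print(exponent("010010"))  # a conjugate of the square above
```

The last two calls illustrate the remark on conjugates: a conjugate of a word with an integer exponent greater than 1 has the same exponent.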

A language is just a set of words. A language is factorial if it is closed under taking
factors of its elements. Any factorial language L is determined by its set of minimal
forbidden words, i.e., the words that are not in L while all their proper factors are in
L. The growth rate of a factorial language L is defined as $\mathrm{Gr}(L) = \lim_{n\to\infty} (C_L(n))^{1/n}$,
where $C_L(n)$ is the number of words of length n in L. An infinite language L grows
exponentially [subexponentially] if Gr(L) > 1 [resp., Gr(L) = 1]. A word w is said to be
(two-sided) extendable in the language L if L contains, for any n, a word of the form uwv
such that $|u|, |v| > n$. The set of all extendable words in L is denoted by e(L).
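To get a feeling for the function $C_L(n)$, one can enumerate a factorial language level by level. The sketch below does this for the binary cube-free language; it is an illustration under our own naming (`ends_with_cube`, `count_cube_free`), not the method used in the paper. Since the language is factorial, a word obtained by appending a letter to a cube-free word needs to be checked only for cubes that are suffixes.

```python
def ends_with_cube(w):
    """True if some suffix of w has the form uuu with u non-empty."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        if w[n - 3 * p:n - 2 * p] == w[n - 2 * p:n - p] == w[n - p:]:
            return True
    return False

def count_cube_free(max_len):
    """Numbers of binary cube-free words of lengths 1..max_len."""
    level, counts = [""], []
    for _ in range(max_len):
        level = [w + a for w in level for a in "01" if not ends_with_cube(w + a)]
        counts.append(len(level))
    return counts

counts = count_cube_free(12)
print(counts)
# crude finite-length estimate of Gr(L); it only approaches the limit slowly
print(counts[-1] ** (1 / len(counts)))
```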

A morphism is any map f from words to words satisfying the condition $f(w) =
f(w[1]) \cdots f(w[|w|])$ for each word w. A morphism is non-erasing if the image of any
non-empty word is non-empty, and n-uniform if the images of all letters have length
n. An n-uniform morphism f is called k-synchronizing if for any factor of length k of
any word f(w), the starting positions of all occurrences of this factor in f(w) are equal
modulo n.

A word w meets a pattern P if an image of P under some non-erasing morphism is a 
factor of w; otherwise, w avoids P. The images of the pattern xx [resp., xxx; xyxyx] are 
called squares [resp., cubes, overlaps]. The words avoiding xx [resp., xxx; both xxx and 
xyxyx] are square-free [resp., cube-free, overlap-free].
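The definition of meeting a pattern translates directly into a brute-force matcher. The following Python sketch is an illustration only (the function name `meets` is ours); it tries every starting position and every assignment of non-empty values to the variables, so it is suitable for short words only.

```python
def meets(word, pattern):
    """True if some factor of `word` is an image of `pattern`
    under a non-erasing morphism (each variable gets a non-empty value)."""
    def match(pos, idx, assign):
        if idx == len(pattern):
            return True
        var = pattern[idx]
        if var in assign:  # the variable already has a value: it must repeat
            val = assign[var]
            return word.startswith(val, pos) and match(pos + len(val), idx + 1, assign)
        for end in range(pos + 1, len(word) + 1):  # try every non-empty value
            assign[var] = word[pos:end]
            found = match(end, idx + 1, assign)
            del assign[var]
            if found:
                return True
        return False
    return any(match(start, 0, {}) for start in range(len(word)))

print(meets("0110", "xx"))                 # the factor 11 is a square
print(meets("10101", "xyxyx"))             # x = 1, y = 0 gives an overlap
print(meets("0110100110010110", "xyxyx"))  # a prefix of the Thue-Morse word
```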

If f is a non-erasing morphism and f(a) = au for a letter a and a non-empty word
u, then an infinite iteration of f generates an $\omega$-word denoted by $\mathbf{f} = f^\infty(a)$. The
$\omega$-words obtained in this way are called D0L-words or purely morphic words. The images
of letters under a morphism f are called f-blocks. Note that the D0L-word $\mathbf{f}$ is a product
of f-blocks, and also of $f^n$-blocks for any n > 1, because the morphism $f^n$ generates the
same D0L-word $\mathbf{f}$.

The Thue-Morse morphism is defined by the rules $\theta(0) = 01$, $\theta(1) = 10$ and generates
the Thue-Morse word $t = \theta^\infty(0)$. The factors of t are Thue-Morse factors. We use the
notation $t_k = \theta^k(0)$ and $\bar{t}_k = \theta^k(1)$ for $\theta^k$-blocks. The properties listed in Lemma 2.1
below are well known and follow by induction from the facts that t is a product of $\theta$-blocks
and $\theta(t) = t$. The third property was first proved by Thue [21]. In the same paper, Thue
proved that t is an overlap-free word.
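As a small illustration of these definitions, prefixes of a D0L-word can be generated by iterating the morphism on its first letter. The sketch below does this for the Thue-Morse morphism and tests a prefix for overlaps by brute force; the names `iterate`, `THETA` and `has_overlap` are our own.

```python
THETA = {"0": "01", "1": "10"}  # the Thue-Morse morphism

def iterate(morphism, letter, k):
    """Return the k-th iterate of the morphism applied to a letter."""
    word = letter
    for _ in range(k):
        word = "".join(morphism[c] for c in word)
    return word

def has_overlap(w):
    """True if w has a factor of length 2p+1 with period p, for some p >= 1."""
    n = len(w)
    for p in range(1, n):
        for i in range(n - 2 * p):
            if w[i:i + p + 1] == w[i + p:i + 2 * p + 1]:
                return True
    return False

t5 = iterate(THETA, "0", 5)           # the block t_5 of length 32
print(t5)
print(has_overlap(iterate(THETA, "0", 7)))
```

Because $\theta(0)$ begins with 0, each iterate is a prefix of the next one, which is why a finite iterate can stand in for a prefix of t.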

Lemma 2.1. 1) The number of Thue-Morse factors of length n is $\Theta(n)$.

2) For any fixed k, the number of pairs of equal adjacent $\theta^k$-blocks in any Thue-Morse
factor of length n is $n/(3 \cdot 2^k) + O(1)$.

3) If vv is a Thue-Morse factor, then v is either a $\theta^k$-block or a product of three alternating
$\theta^k$-blocks, for some $k \ge 0$. The position in which vv ends in t is divisible by $2^k$ but not
by $2^{k+1}$.
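Statements 1 and 3 of the lemma are easy to observe experimentally on a finite prefix of t. The following sketch (helper names are ours) counts the distinct factors of small lengths and collects the lengths |v| of all squares vv in the prefix; by statement 3 these lengths should all be of the form $2^k$ or $3 \cdot 2^k$. A finite prefix can of course only illustrate the lemma, not prove it.

```python
def thue_morse_prefix(k):
    """Prefix of the Thue-Morse word of length 2**k."""
    word = "0"
    for _ in range(k):
        word = "".join("01" if c == "0" else "10" for c in word)
    return word

def factor_count(w, n):
    """Number of distinct factors of length n in w."""
    return len({w[i:i + n] for i in range(len(w) - n + 1)})

def square_half_lengths(w):
    """Set of |v| over all squares vv occurring in w."""
    found = set()
    for p in range(1, len(w) // 2 + 1):
        if any(w[i:i + p] == w[i + p:i + 2 * p] for i in range(len(w) - 2 * p + 1)):
            found.add(p)
    return found

t = thue_morse_prefix(8)
print([factor_count(t, n) for n in range(1, 9)])
print(sorted(square_half_lengths(t)))
```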

A set of patterns (in particular, a single pattern) is 2-avoidable if there exists a binary
$\omega$-word avoiding this set, and 2-D0L-avoidable if such a D0L-word exists. The existence
of an avoiding $\omega$-word is clearly equivalent to the existence of an infinite set of avoiding
finite words.

3 Avoidable and unavoidable patterns 

In this section we classify the binary patterns avoidable by binary cube-free words. As was 
already mentioned, the pattern xyxyx is avoided by the Thue-Morse word. The following 
observation can be easily checked by hand or by computer. 

Observation 3.1. All binary patterns of length at most 5, except for the pattern xyxyx, 
are unavoidable by binary cube-free words. 
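A computer check of this kind can be organized as an exhaustive depth-first search over words avoiding the given set of patterns. The sketch below is one possible way to do it and is not the authors' program; the names `suffix_is_image` and `longest_avoiding` and the cut-off `limit` are our assumptions. If the search terminates, the set is unavoidable and the longest avoiding word is returned; if the cut-off length is reached, the function returns `None`, which suggests (but does not prove) avoidability.

```python
def suffix_is_image(word, pattern):
    """True if some suffix of `word` is an image of `pattern`."""
    n = len(word)
    def match(pos, idx, assign):
        if idx == len(pattern):
            return pos == n
        var = pattern[idx]
        if var in assign:
            val = assign[var]
            return word.startswith(val, pos) and match(pos + len(val), idx + 1, assign)
        for end in range(pos + 1, n + 1):
            assign[var] = word[pos:end]
            found = match(end, idx + 1, assign)
            del assign[var]
            if found:
                return True
        return False
    return any(match(start, 0, {}) for start in range(n))

def longest_avoiding(patterns, limit=200):
    """Longest binary word avoiding all patterns, or None if length `limit` is reached."""
    best, stack = "", [""]
    while stack:
        w = stack.pop()
        if len(w) > len(best):
            best = w
        if len(w) >= limit:
            return None
        for a in "01":
            c = w + a
            # all proper factors of c already avoid the patterns,
            # so only the suffixes of c need to be tested
            if not any(suffix_is_image(c, p) for p in patterns):
                stack.append(c)
    return best

print(longest_avoiding(["xxx", "xx"]))   # cube-free and square-free binary words
print(longest_avoiding(["xxx", "xyx"]))
```

For a pattern P from the observation one would call `longest_avoiding(["xxx", P])`; the running time of this naive version grows quickly with the length of the words involved.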

Next we focus our attention on the patterns of length 6. For both avoidability and 
growth, the patterns can be studied up to negation and reversal. Thus, we obtain the 
list of eight patterns: 

xxyxxy, xxyxyx, xxyxyy, xxyyxx, xxyyxy, xyxxyx, xyxyyx, xyyxxy. (2) 

The pattern xxyxyx is obviously avoided by the Thue-Morse word as it has the factor
xyxyx. The pattern xxyxxy is also avoided by the Thue-Morse word, as was first mentioned
in [8]. (For the complete set of binary patterns avoided by the Thue-Morse word see
[20].) The last four words from the list (2) are unavoidable, as can be easily checked by
computer. The longest cube-free words avoiding these patterns are listed in Table 1. The
remaining two patterns xxyyxx and xxyxyy are avoidable, see Theorems 3.1 and 3.2 below.

It follows immediately from the classification of patterns of length 6 that almost all 
binary patterns of length 7 are avoidable. Only three patterns of length 7, namely, 

xxyyxyx, xyxxyxy, xyxxyyx, 

have no proper avoidable factors. The last of these patterns is unavoidable (see Table 1),
while the first two are avoidable (see Theorem 3.2). Finally, there is a unique pattern
xyxxyyxy of length 8 for which all proper prefixes and suffixes are unavoidable. But this
pattern is avoidable by the Thue-Morse word [20].



Table 1: Longest avoiding cube-free words for unavoidable patterns.

Pattern   Longest avoiding cube-free word u                          |u|
xxyyxy    010100101101001011010010011001100                          33
xyxxyx    00110101100101001101011001001101100101001101011001010011   56
xyxyyx    001100100110110010011011001001011                          33
xyyxxy    0011011010010100101101100                                  25
xyxxyyx   0011001100100101101001011010010100101101100                43

Thus, we have reduced statement 1 of Theorem 1.1 to the proof of Theorems 3.1
and 3.2. Since all avoidability proofs are obtained by constructing D0L-words, we also
get statement 2 of Theorem 1.1.

Theorem 3.1. There exists a binary cube-free D0L-word avoiding the pattern xxyyxx.

Consider the morphism $\mu$ defined by the equalities $\mu(0) = 010$, $\mu(1) = 011$, and the
D0L-word $m = \mu^\infty(0)$. Some properties of the word m are gathered in the following
lemma.

Lemma 3.1. Let k be an arbitrary nonnegative integer.

1) One has $m[3k+1] = 0$ and $m[3k+2] = 1$.

2) The last letter in the block $\mu^k(a)$ is a. All other letters in $\mu^k(0)$ and $\mu^k(1)$ coincide.

3) If u is a factor of m and $|u| = 3^k$, then the starting positions of all occurrences of u
in m are equal modulo $3^k$.

4) If m contains a square uu and $3^k \le |u| < 3^{k+1}$, then $|u| \in \{3^k, 2 \cdot 3^k\}$.

5) Suppose that $m[r_1 3^k+c \ldots r_2 3^k+c-1]$ is a square for some integers $r_1, r_2, c$ such that
$0 < c < 3^k$. Then the word $m[r_1 3^k+1 \ldots r_2 3^k]$ is a square as well.

Proof. Properties 1 and 2 follow immediately from the definition of $\mu$. Let us prove
property 3 by induction on k.

The base cases are k = 0 (holds trivially) and k = 1, which follows directly from
property 1. Now we let $k \ge 2$ and prove the inductive step. Assume to the contrary that
two occurrences of the factor u of length $3^k$ have starting positions $j_1$ and $j_2$ that are
different modulo $3^k$. These positions are also the starting positions of the occurrences
of the factor $u[1 \ldots 3^{k-1}]$. Hence, $j_1 \equiv j_2 \pmod{3^{k-1}}$ by the inductive assumption. By
property 2, both considered occurrences of u are preceded by the same $(j_1 \bmod 3^{k-1}) - 1$
letters. Thus, m contains a factor u' such that $|u'| = 3^k$, u' is a product of $\mu^{k-1}$-blocks,
and the starting positions of two occurrences of u' are different modulo $3^k$. Hence, m
also contains the factor $\mu^{-1}(u')$ of length $3^{k-1}$ such that the starting positions of two
occurrences of $\mu^{-1}(u')$ are different modulo $3^{k-1}$, in contradiction with the inductive
assumption. Therefore, property 3 is proved.

Property 4 is an immediate consequence of property 3. In order to prove property 5,
we note that property 2 implies the equality $m[r_1 3^k+1 \ldots r_1 3^k+c-1] =
m[r_2 3^k+1 \ldots r_2 3^k+c-1]$. Hence, the two considered words are conjugates. But all
conjugates of a square are squares. □

Proof of Theorem 3.1. Let us prove that m is cube-free and avoids xxyyxx. Aiming at
a contradiction, first assume that m contains a cube; consider the shortest one, say $u^3$.
Then $|u| > 2$ in view of Lemma 3.1(1). Hence $|u| \equiv 0 \pmod 3$ by Lemma 3.1(4). Using
Lemma 3.1(5), we find a cube $u'^3$ which is a product of $\mu$-blocks. Then m contains the
cube $(\mu^{-1}(u'))^3$, in contradiction with the choice of $u^3$.

The argument for the pattern xxyyxx is essentially the same. If m has a factor uuvvuu,
then Lemma 3.1(1) implies that at least one of the numbers $|u|$, $|v|$ is greater than 2. Then
this number is divisible by 3 by Lemma 3.1(4), and hence the other number is divisible by
3 too (Lemma 3.1(3)). Therefore, we can apply Lemma 3.1(5) to get a factor u'u'v'v'u'u'
which begins with the starting position of a $\mu$-block. Then m contains a shorter forbidden
factor $\mu^{-1}(u'u'v'v'u'u')$, contradicting the choice of uuvvuu. □
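The statements above can be sanity-checked on a finite prefix of m. The sketch below assumes the morphism $\mu(0) = 010$, $\mu(1) = 011$ and uses our own helper names (`iterate`, `meets`); it verifies properties 1 and 2 of Lemma 3.1 for small k and searches the prefix $\mu^4(0)$ for cubes and for images of xxyyxx by brute force. Such a finite check illustrates the theorem but does not replace its proof.

```python
MU = {"0": "010", "1": "011"}

def iterate(morphism, letter, k):
    word = letter
    for _ in range(k):
        word = "".join(morphism[c] for c in word)
    return word

def meets(word, pattern):
    """Brute-force test: does some factor of `word` match `pattern`?"""
    def match(pos, idx, assign):
        if idx == len(pattern):
            return True
        var = pattern[idx]
        if var in assign:
            val = assign[var]
            return word.startswith(val, pos) and match(pos + len(val), idx + 1, assign)
        for end in range(pos + 1, len(word) + 1):
            assign[var] = word[pos:end]
            found = match(end, idx + 1, assign)
            del assign[var]
            if found:
                return True
        return False
    return any(match(start, 0, {}) for start in range(len(word)))

m = iterate(MU, "0", 4)  # prefix of m of length 81

# Property 1 (positions are 1-based in the text, 0-based here)
print(all(m[i] == "0" and m[i + 1] == "1" for i in range(0, len(m), 3)))
# Property 2: the blocks mu^k(0) and mu^k(1) differ only in the last letter
print(all(iterate(MU, "0", k)[:-1] == iterate(MU, "1", k)[:-1] for k in range(1, 5)))
# Theorem 3.1 on the prefix
print(meets(m, "xxx"), meets(m, "xxyyxx"))
```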



1 The morphism $\mu'$ defined by $\mu'(0) = 001$, $\mu'(1) = 011$, also avoids {xxx, xxyyxx} (see [19];
independently discovered by J. Shallit, private communication). We prefer the morphism $\mu$ because
its study allows us to prove that the avoiding language grows exponentially (see Theorem 4.3).

Theorem 3.2. There exist binary cube-free D0L-words avoiding the patterns xxyxyy,
xxyyxyx, and xyxxyxy, respectively.²



Our proof involves a rather short computer check based on the following two lemmas. 

Lemma 3.2 (Richomme, Wlazinski, [17]). A morphism $f : \{0,1\}^* \to \{0,1\}^*$ is cube-free
if and only if the word f(001101011011001001010011) is cube-free.
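Lemma 3.2 reduces cube-freeness of a binary morphism to a single finite check, which is straightforward to program. The sketch below takes the lemma and its test word as stated; the function names and the two sample morphisms are our own choices for illustration.

```python
TEST_WORD = "001101011011001001010011"  # the test word from Lemma 3.2

def has_cube(w):
    """True if w contains a factor uuu with u non-empty."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if w[i:i + p] == w[i + p:i + 2 * p] == w[i + 2 * p:i + 3 * p]:
                return True
    return False

def apply(morphism, w):
    return "".join(morphism[c] for c in w)

def is_cube_free_morphism(morphism):
    """Criterion of Lemma 3.2: test the image of a single word."""
    return not has_cube(apply(morphism, TEST_WORD))

thue_morse = {"0": "01", "1": "10"}
doubling = {"0": "00", "1": "1"}   # maps 00 to 0000, so it cannot be cube-free

print(is_cube_free_morphism(thue_morse))
print(is_cube_free_morphism(doubling))
```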

Lemma 3.3. Suppose that an $\omega$-word $\mathbf{f}$ is generated by a k-synchronizing n-uniform
cube-free binary morphism f, and $P \in \{xxyxyy, xyyxyx\}$. Then $\mathbf{f}$ meets P if and only if $\mathbf{f}$
contains the factor g(P) for some morphism g satisfying $|g(x)|, |g(y)| < k$.

Proof. We assume that the word $\mathbf{f}$ contains a factor of the form g(P) such that
$\max\{|g(x)|, |g(y)|\} \ge k$ and prove that $\mathbf{f}$ must contain a shorter image of P. Let
$x' = g(x)$, $y' = g(y)$, $|x'| \ge k$. The starting positions of all occurrences of x' in $\mathbf{f}$ are equal
modulo n by the definition of a k-synchronizing morphism. Considering the occurrences
inside g(P), we see that $|x'| \equiv |x'y'| \equiv 0 \pmod n$ if P = xxyxyy and $|x'y'y'| \equiv |x'y'| \equiv 0
\pmod n$ if P = xyyxyx. Thus, in both cases $|x'|$ and $|y'|$ are divisible by n. The assumption
$|y'| \ge k$ leads to the same result.

Now we can write $x' = x_1 x_2 x_3$, $y' = y_1 y_2 y_3$, where $x_1, y_1$ [respectively, $x_2, y_2$; $x_3, y_3$]
are suffixes [respectively, products; prefixes] of f-blocks, $|x_1| = |y_1| = r$, $|x_3| = |y_3| = l$,
$l + r = n$. An f-block is determined either by its prefix of length l or by its suffix of length
r. Thus, $\mathbf{f}$ contains another image of P of length $|g(P)|$: the starting position of this
image is either r symbols to the right or l symbols to the left from the starting position
of g(P). This new image h(P) is a product of f-blocks. As a result, h(x) and h(y) are
products of f-blocks also. Hence, $\mathbf{f}$ contains an image of P under the composition of $f^{-1}$
and h; this image is shorter than g(P), as required. □

Proof of Theorem 3.2. Consider the morphisms $h_1$, $h_2$, and $h_3$ such that

$h_1(0) = 0110010$    $h_2(0) = 01001$    $h_3(0) = 010011$
$h_1(1) = 1001101$    $h_2(1) = 10110$    $h_3(1) = 011001$

Checking the condition of Lemma 3.2 by computer, we obtain that all these morphisms
are cube-free. Furthermore, it can be directly verified that $h_1$, $h_2$, and $h_3$ are 6-, 6-,
and 5-synchronizing, respectively. Hence, if the D0L-word $\mathbf{h}_1$ generated by $h_1$ meets the
pattern P = xxyxyy, then by Lemma 3.3 $\mathbf{h}_1$ contains an image of P of length at most
5 · 6 = 30. Thus, it is enough to check all factors of $\mathbf{h}_1$ of length at most 30. Any such
factor is contained in the image of a factor of $\mathbf{h}_1$ of length 6; this factor, in turn, belongs
to the image of a factor of length 2, while all factors of length 2 can be found in $h_1(0)$.
Therefore, we just need to examine all factors of length up to 30 in the word $h_1^3(0)$. A
computer check shows that there are no images of P among such factors. So, we conclude
that $\mathbf{h}_1$ avoids both cubes and the pattern xxyxyy.
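The finite check described for $h_1$ can be sketched as follows. The morphism is transcribed from the proof; the function `short_images` and its interface are our own assumptions. It lists all occurrences of images g(P) with $|g(x)|, |g(y)|$ bounded by a given length, which for the bound 5 and P = xxyxyy covers exactly the images of length at most 30.

```python
from itertools import product

H1 = {"0": "0110010", "1": "1001101"}  # transcribed from the proof above

def iterate(morphism, letter, k):
    word = letter
    for _ in range(k):
        word = "".join(morphism[c] for c in word)
    return word

def short_images(word, pattern, max_len):
    """Occurrences (start, values) of images of `pattern` in `word`
    in which every variable has a value of length 1..max_len."""
    hits = []
    for lx, ly in product(range(1, max_len + 1), repeat=2):
        lengths = [lx if c == "x" else ly for c in pattern]
        total = sum(lengths)
        for start in range(len(word) - total + 1):
            pos, values, ok = start, {}, True
            for c, l in zip(pattern, lengths):
                piece = word[pos:pos + l]
                if values.setdefault(c, piece) != piece:
                    ok = False
                    break
                pos += l
            if ok:
                hits.append((start, dict(values)))
    return hits

w = iterate(H1, "0", 3)                 # the word h_1^3(0), of length 343
print(len(w))
print(short_images(w, "xxyxyy", 5))     # per the proof, no images are expected here
```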

A similar argument for the morphism $h_2$ and the pattern xxyyxyx, containing xyyxyx,
shows that it is enough to examine the factors of length up to 35 in a suitable iterate
$h_2^j(0)$. A computer check implies the desired avoidability result. In the same way, we check
the factors of length up to 28 in $h_3^3(0)$ to show that the corresponding D0L-word avoids the
pattern xyxxyxy. The theorem is proved. □

2 2-D0L-avoidability of the pattern xxyxyy was first observed by J. Cassaigne who found a 12-uniform
avoiding cube-free morphism (private communication). This pattern is also avoided by a cube-free
"quasi-morphism" defined in [19].

4 Lower bounds for the growth rates



In this section we prove lower bounds for the growth rates of the languages avoiding the
sets {xxx, P}, where P = xyxyxx or P is any of the patterns listed in (1), except for the
pattern xyxyx. In particular, the results of this section imply statement 3 of Theorem 1.1.
The bounds are obtained using two different methods. The first method uses block
replacing in the factors of D0L-words, and is purely analytic. We apply this method to
the patterns xyxyxx, xxyxxy, xxyyxx. The second method uses morphisms that act on the
ternary alphabet and map ternary square-free words to binary cube-free words avoiding
the given patterns. This method requires some computer search and check; we apply it
to the remaining four patterns. (The second method can be applied for all patterns, but
the analytic bounds are a bit better.)



4.1 Replacing blocks in D0L-words

Theorem 4.1. The number of binary cube-free words avoiding the pattern xyxyxx grows
exponentially with the rate of at least $2^{1/24} \approx 1.0293$.

Proof. Let L be the language of all binary cube-free words avoiding xyxyxx. Recall that
L contains all Thue-Morse factors. Consider the "distorted" $\theta^5$-block

t' = 0110 1001 1001 1 0110 1001 0110 0110 1001, (3)

obtained from the block $t_5$ by inserting the letter 1 in the 13th position, and its negation
$\bar{t}'$ obtained by inserting a 0 in the same way into $\bar{t}_5$. Let S be the set of all $\omega$-words that
can be obtained from the Thue-Morse word t by replacing some of its $\theta^5$-blocks by the
corresponding distorted blocks. Available places for inserting letters are shown below:

[Display (4): the word t written as a product of $\theta^2$-blocks, with the $\theta^5$-blocks $t_5$, $\bar{t}_5$
marked and the letters that may be inserted shown above their positions.]

Let $z \in S$. It is easy to check manually that z does not contain short cubes; as will be
shown below, z does not contain long overlaps, and hence has no cubes at all. Now, our
goal is to prove that z avoids the pattern xyxyxx.

Claim. Let w = uvuvuu be a minimal forbidden word for L. Then $u \in \{0, 1, 01, 10\}$.

Assume that $|u| > 1$. Then $u[i] \ne u[i+1]$ for all i and, moreover, $u[|u|] \ne u[1]$. Indeed,
otherwise $w[i \ldots 2|uv|+i+1]$ is an image of xyxyxx, a contradiction with the minimality of
w. Hence, $u \in \{(01)^s, (10)^s\}$. Since uu is not forbidden, s = 1. The claim is proved.

Let us consider the overlaps in z. The case analysis below is performed up to negation.
Each overlap surely contains at least one inserted letter. Two short overlaps can be easily
observed inside the word t', see (3). They are t'[5...14] = 1001100110 and t'[11...18] =
01101101. These overlaps obviously avoid the pattern xyxyxx. One can easily check that
there is no other overlap of period < 10. Note that the words 00110011 and 11011 are
not Thue-Morse factors and thus their occurrences in z indicate an inserted letter (the
bold one).

Now assume to the contrary that some word $z \in S$ meets the pattern xyxyxx. Let
w = uvuvuu be the shortest word among the images of xyxyxx in all words $z \in S$.
We already know that $|uv| > 10$. So, if v contains an inserted letter then one of the
corresponding "indicators" 00110011 and 11011 occurs inside uvu. Hence, the same
letter was inserted in the other occurrence of v. If we delete both these inserted letters
from w, we will get a shorter image of xyxyxx, contradicting the choice of w. Thus, w
contains inserted letters only inside u. Recall that $|u| \le 2$ by the claim.

Assume that the letter 1 was inserted inside the second (middle) occurrence of u.
Then if $|u| = 1$, the Thue-Morse word contains the square vv. If $|u| = 2$ then u = 10,
because the inserted letter is preceded by the same letter. So, the 1 in the first (left)
occurrence of u is not an inserted letter. Thus, 0v0v is a square in t. Then the word
v or the word 0v should be either a $\theta^k$-block or a product of three alternating $\theta^k$-blocks
(Lemma 2.1(3)). But v ends with 01100110, see (3), so we get a contradiction.

Now note that if an inserted 1 is in the third (right) occurrence of u, then the
corresponding indicator 00110011 occurs in the suffix uvu of w. Hence 00110011 occurs in
the prefix uvu of w. Thus, 1 was also inserted inside the middle occurrence of u, which
is impossible as we have shown already. So, the only remaining position for the inserted
letter is in the left occurrence of u. Then $|u| = 1$ (otherwise, t contains an overlap), and
vuvu is a factor of t. But the inserted letter is preceded by the same letter, so uvuvu
must be a Thue-Morse factor. This contradiction finishes the proof of the fact that the
word z avoids xyxyxx.

Thus, we have proved that all finite factors of the word z belong to L. To finish the
proof, we take a large enough number n and consider all Thue-Morse factors of length n.
For each factor, we perform the insertions of letters into $\theta^5$-blocks according to both (3)
and the negation of (3), in all possible combinations. Thus we obtain $2^k$ words from L,
where k stands for the number of $\theta^5$-blocks in the processed factor. Note that the words
obtained from different factors are different (for instance, such words contain indicators
in different positions). A Thue-Morse factor of length n contains $n/32 + O(1)$ "regular"
$\theta^5$-blocks plus those $\theta^5$-blocks occurring on the border of two equal $\theta^5$-blocks, see (4).
Using Lemma 2.1(2), we obtain the total of $n/32 + n/96 + O(1) = n/24 + O(1)$ blocks. Taking
Lemma 2.1(1) into account, we see that we constructed $\Theta(n) \cdot 2^{n/24+O(1)}$ words from L, and the
lengths of these words cover the interval of length $O(n)$. Therefore, the growth rate of L is at
least $2^{1/24}$, as desired. □
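The construction in this proof is easy to reproduce on a finite prefix. The sketch below builds the distorted blocks from (3), replaces the "regular" (aligned) $\theta^5$-blocks selected by a user-supplied predicate, and tests the result by brute force. All function names are ours, and only the aligned blocks are handled, which is a simplifying assumption: the proof also uses the blocks lying on the border of two equal $\theta^5$-blocks.

```python
THETA = {"0": "01", "1": "10"}

def iterate(morphism, letter, k):
    word = letter
    for _ in range(k):
        word = "".join(morphism[c] for c in word)
    return word

def meets(word, pattern):
    def match(pos, idx, assign):
        if idx == len(pattern):
            return True
        var = pattern[idx]
        if var in assign:
            val = assign[var]
            return word.startswith(val, pos) and match(pos + len(val), idx + 1, assign)
        for end in range(pos + 1, len(word) + 1):
            assign[var] = word[pos:end]
            found = match(end, idx + 1, assign)
            del assign[var]
            if found:
                return True
        return False
    return any(match(start, 0, {}) for start in range(len(word)))

T5, T5_BAR = iterate(THETA, "0", 5), iterate(THETA, "1", 5)
T_PRIME = T5[:12] + "1" + T5[12:]              # letter 1 inserted in the 13th position
T_PRIME_BAR = T5_BAR[:12] + "0" + T5_BAR[12:]  # the negation of t'

def distort(k, choose):
    """Replace the aligned theta^5-blocks of theta^k(0), k >= 5, selected by choose(index)."""
    t = iterate(THETA, "0", k)
    out = []
    for j in range(0, len(t), 32):
        block = t[j:j + 32]
        if choose(j // 32):
            block = T_PRIME if block == T5 else T_PRIME_BAR
        out.append(block)
    return "".join(out)

z = distort(7, lambda index: True)   # all four aligned blocks distorted
print(len(z), meets(z, "xxx"), meets(z, "xyxyxx"))
print(2 ** (1 / 24))                 # the lower bound of Theorem 4.1
```

Choosing different predicates `choose` gives the $2^k$ different words counted in the proof.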

Theorem 4.2. The number of binary cube-free words avoiding the pattern xxyxxy grows
exponentially with the rate of at least $2^{1/24} \approx 1.0293$.

Proof. As in the proof of Theorem 4.1, we get an exponential lower bound using multiple
insertions into the Thue-Morse word. But now we need to insert rather long words, not
just letters. Let L be the language of all binary cube-free words avoiding xxyxxy. Recall
that L contains all Thue-Morse factors. Consider the word

t' = 0110 1001 1001 0110 1001 0110 [01010 01 1001 0110 1001 0110] 0110 1001, (5)

obtained from the $\theta^5$-block $t_5$ by inserting the marked (bracketed) factor s of length 23 in
the 25th position, and its negation $\bar{t}'$ obtained by inserting $\bar{s}$ in the 25th position of $\bar{t}_5$.
One can check directly that both t' and $\bar{t}'$ are cube-free and avoid xxyxxy. Note that
t'[25...29] = 01010, but t'[1...28] is an overlap-free word ending with the square $\theta(100100)$,
and t'[26...55] is a Thue-Morse factor. Let S be the set of all $\omega$-words that can be obtained
from the Thue-Morse word t by replacing some of its $\theta^5$-blocks by the corresponding blocks
t', $\bar{t}'$. Let us consider a successive pair of inserted factors in $z \in S$ (here $a, b \in \{0, 1\}$):

[Display (6): $z = \cdots a\bar{a}a\bar{a}a \cdots b\bar{b}b\bar{b}b \cdots$, with braces marking the factor w and an
overlap-free word.]

We see that w is a Thue-Morse factor, $wb\bar{b}$ is an overlap-free word with the suffix $(\theta(\bar{b}bb))^2$.
Moreover, assume for a moment that the left of the two considered insertions is withdrawn;
then the factor $wb\bar{b}$ still would occur in the same place.

Let us show that an $\omega$-word $z \in S$ contains no overlap except for 01010 and 10101.
Assume to the contrary that such overlaps exist. Consider the overlap w = uvuvu
which has the shortest period (among the overlaps in all $z \in S$) and is not extendable
(i.e., is not contained in a longer factor of z with the same period). In view of (6), w
should contain the factor 10101 or 01010. We assume w.l.o.g. that w contains 01010
and $w[i \ldots i+4] = 01010$ is the rightmost occurrence of this factor in w. This occurrence
is certainly not inside the prefix uvu of w. Suppose that this occurrence is inside the
suffix uvu of w. Then we have $w[i-|uv| \ldots i-|uv|+4] = 01010$. Both these occurrences of
01010 are prefixes of the occurrences of s in z. Since the leftmost of these occurrences
of s is obviously inside w, the rightmost one is also inside w due to non-extendability of
w. Moreover, non-extendability of w implies that the rightmost occurrence of s is not
a suffix of w, because s is always followed by $t_3$. Now we can delete both mentioned
occurrences of s and get an overlap with a smaller period in contradiction with the choice
of w. One case of mutual location of the factors of w is depicted below, the others are
quite similar. Deleting the occurrences of s in the case presented in the picture gives the
overlap $u_2 v_1 u_2 v_1 u_2$.

$w = |\,u_1 u_2\,|\,v_1 v_2\,|\,u_1 u_2\,|\,v_1 v_2\,|\,u_1 u_2\,|$

Thus, it remains to consider the case when the rightmost occurrence of 01010 in w =
uvuvu strictly contains the middle u. Since t' contains no overlaps except for 01010, we
conclude that $|s| < |uv|$. Then the mutual location of the factors in w looks like in the
following picture.

$w = |\,u\,|\,v_1 v_2 v_3 v_4\,|\,u\,|\,v_1 v_2 v_3 v_4\,|\,u\,|$

The word $v_3$ begins and ends with 0, and $v_4$ also begins with 0, see (5). Then the word
$v_3 v_3 v_4$ begins with a shorter overlap, and this overlap contains at least five zeroes. Since
the word $v_3 v_3 v_4$ occurs in an $\omega$-word from S, we get a contradiction with the minimality
of the period of w. Thus, we have proved that the "long" overlap w does not exist.
Therefore, all $\omega$-words from S contain no overlaps except for 01010 and 10101 and, in
particular, are cube-free.

Now assume that $z \in S$ contains an image of the pattern xxyxxy, i.e., the factor
w = uuvuuv for some nonempty words u, v. W.l.o.g., this factor is preceded in z by 0.
Since the word 0w is not an overlap, the word v ends with 1. Then 0w contains both
factors 0uu and 1uu. But one of the words 0uu, 1uu is an overlap, i.e., is equal to 01010
(resp., 10101). Let v = v'1 and consider both cases.

Case 1. 0w = 0 0101v'1 0101v'1. By (5), v' ends with 1. Then the word w is followed
by 0. Hence, w0 is an overlap, which is impossible.

Case 2. 0w = 0 1010v'1 1010v'1. There is no factor 01010 or 10101 on the border
between the left and the right uuv. Hence, the factors s and $\bar{s}$ in w, if any, are inside
uuv (recall that w is not extendable to the right, because z has no long overlaps).
Therefore, after deleting all occurrences of s and $\bar{s}$ in w, we will still have a square of the
form $(1010\ldots)^2$. But the Thue-Morse word has no such squares, see Lemma 2.1(3). This
contradiction proves that the $\omega$-word z avoids the pattern xxyxxy.

It remains to estimate the total number of factors in all words z. Repeating the
argument from the proof of Theorem 4.1, we arrive at the same bound $2^{1/24}$. □

Theorem 4.3. The number of binary cube-free words avoiding the pattern xxyyxx grows
exponentially with the rate of at least $2^{1/18} \approx 1.0392$.

Proof. Proving this lower bound, we cannot rely on the Thue-Morse word, because it
meets the pattern xxyyxx. Instead, we apply the insertion technique to the D0L-word m
generated by the morphism $\mu$, introduced in Sect. 3. The word m is a product of $\mu$-blocks,
as well as of $\mu^2$-blocks, and has no other occurrences of such blocks by Lemma 3.1(3).
Consider the word

m' = 010 011 010 01010 (7)

obtained by attaching the factor 01010 to the block $\mu^2(0)$. Let S be the set of all $\omega$-words
obtained from m by replacing some of the blocks $\mu^2(0)$ by the words m' (in other words,
by inserting the factor 01010 after some blocks $\mu^2(0)$).
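For illustration, the set S can be sampled on a finite prefix of m as follows. The sketch assumes $\mu(0) = 010$, $\mu(1) = 011$; the names `insert_markers` and `choose` are ours. It walks through the aligned $\mu^2$-blocks of $\mu^k(0)$ and appends 01010 after those blocks $\mu^2(0)$ that the predicate selects.

```python
MU = {"0": "010", "1": "011"}
INSERTED = "01010"

def iterate(morphism, letter, k):
    word = letter
    for _ in range(k):
        word = "".join(morphism[c] for c in word)
    return word

BLOCK = iterate(MU, "0", 2)   # the block mu^2(0)
M_PRIME = BLOCK + INSERTED    # the word m' from (7)

def insert_markers(k, choose):
    """Prefix mu^k(0), k >= 2, with 01010 inserted after the blocks mu^2(0)
    for which choose(index) is true (index counts the blocks mu^2(0))."""
    m = iterate(MU, "0", k)
    out, index = [], 0
    for j in range(0, len(m), 9):
        block = m[j:j + 9]
        out.append(block)
        if block == BLOCK:
            if choose(index):
                out.append(INSERTED)
            index += 1
    return "".join(out)

z = insert_markers(4, lambda index: True)
print(M_PRIME, len(z))
print(2 ** (1 / 18))   # the lower bound of Theorem 4.3
```

Since the letters 0 and 1 are equally frequent in m, about n/18 of the blocks in a prefix of length n are eligible for insertion, which is where the exponent 1/18 in the theorem comes from.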

(A) If one inserts 01010 after u in a word uv G S, then u is followed by 010 [resp., v is 
preceded by 1010] both before and after insertion. 

Note that 0101 is not a factor of m by Lemma [3.1[ 1, and hence we use this word as a 
"marker". Let us show that S avoids {xxx, xxyyxx}. Assume to the contrary that some 
z G S has a forbidden factor, and w is the shortest one among all forbidden factors of all 
words z G S. Since w is not a factor of m, it contains at least one marker 0101. 

The word w equals either to uuu or to uuvvuu, for some words u, v. If u or v contains 
the factor 01010, then one can cancel the corresponding insertions inside each occurrence 
of this word, thus getting a shorter forbidden factor in contradiction with the choice of 
w. Thus, all inserted factors inside w are on the borders of its parts. 

Let w = uuu. Using the fact that m' is always followed by a $\mu^2$-block, it is easy to 
check that |u| > 5. If uu contains 01010 somewhere in the middle, then $01010 = z_1z_2$ and 
$u = z_2u'z_1$. Hence, after cancelling the two insertions inside uuu, one obtains a shorter 
cube u'u'u', a contradiction. Finally, if uuu ends with 0101, then this marker is a suffix 
of u, and we get the previous case. Thus, w has no markers, a contradiction. 

Now let w = uuvvuu. First consider the case where either u or v lies strictly inside 
some factor 01010 (and hence, is equal to 01 or 10). If u = 01, then vv = 0z, where z is 
either a product of $\mu^2$-blocks or such a product with 01010 inserted in the middle. In the 
first case 0z is a factor of m and hence is not a square by Lemma 3.1(4). In the second 
case, the right half of 0z cannot begin with 00 as 0z itself does; once again we see that 
0z is not a square. The case u = 10 and vv = z0 is symmetric to the above one. 

If v = 01 [resp., v = 10], then u begins with 00 [resp., 0] and ends with 0 [resp., 00], implying 
that uu contains the cube 000, which is impossible. 

Thus, the factors 01010 can be found inside w only in the following places: 

w = | u | u | v | v | u | u | 

(In addition, w can have the suffix 0101; in case of any other partial intersection of 01010 
and w, the deletion of this occurrence of 01010 from z leaves w unchanged by (A).) 

Consider any square in z containing 01010 in the middle. Such a square xx can be 
written in the form $z_2x'z_1\,z_2x'z_1$, where $z_1z_2 = 01010$. Then x'x' is a square in m, and thus 
|x'| equals $3^k$ or $2 \cdot 3^k$ for some $k \ge 0$ by Lemma 3.1(4). Trying 
|x'| = 1, 2, 3, 6, one easily sees that it is impossible to obtain both squares x'x' and xx. Hence, x' must be a 
product of $\mu^2$-blocks ending with the block $\mu^2(0)$. Now we proceed with the case analysis. 

Case 1: both uu and vv contain 01010 in the middle. Then 

$w = z_2u'z_1\, z_2u'z_1\, z_4v'z_3\, z_4v'z_3\, z_2u'z_1\, z_2u'z_1$, where $z_1z_2 = z_3z_4 = 01010$. 

Since u' and v' are products of $\mu^2$-blocks, we have $z_1z_4 = z_3z_2 = 01010$. Hence, u'u'v'v'u'u' 
is a factor of m, a contradiction. 

Case 2: uu contains 01010, while vv does not. Then $w = z_2u'z_1\,z_2u'z_1\; vv\; z_2u'z_1\,z_2u'z_1$ and 
u' is a product of $\mu^2$-blocks. Four subcases are possible depending on the existence of 
insertions on the borders of u and v. 

Case 2.1: no insertions. Then $z_1vvz_2$ is a product of $\mu^2$-blocks, implying $|vv| \equiv 4 \pmod 9$. 
By Lemma 3.1(4), v = 10 and $z_1vvz_2 = 011010\,010$. But this is not a $\mu^2$-block, 
a contradiction. 

Case 2.2: an insertion only on the left. Then $v = z_2v'$. Let $\tilde v = v'z_2$. Deleting all 
three insertions of $z_1z_2 = 01010$ from $w = z_2u'z_1\,z_2u'z_1\, z_2v'\,z_2v'\, z_2u'z_1\,z_2u'z_1$, one discovers 
the forbidden factor $u'u'\tilde v\tilde vu'u'$, contradicting the minimality of w. 

Case 2.3: an insertion only on the right; this case is symmetric to Case 2.2. 

Case 2.4: insertions on both sides. Then $v = z_2v' = v''z_1$, where v'v'' is a product of 
$\mu^2$-blocks. Hence $|vv| \equiv 5 \pmod 9$, which is impossible by Lemma 3.1(4). 

Case 3: vv contains 01010, while uu does not. Note that in this case u cannot have the 
suffix 0101. Then 

• $w = uu\, z_2v'z_1\,z_2v'z_1\, uu$; 

• v' is a product of $\mu^2$-blocks, ending with $\mu^2(0)$ (in particular, $v' = 010 \cdots 1010$); 

• uu is a factor of m (in particular, u has no factor 0101). 

If |u| = 1, then either the first letter of $z_2$ or the last letter of $z_1$ equals u, implying that 
w contains a cube of a letter, which is impossible. The assumption |u| = 2 (i.e., u = 10) 
also leads to a contradiction for all values of $z_1$. Namely, if $z_1$ ends with 0, then $uuz_2$ 
begins with $(10)^3$; if $z_1 = 01$, then $z_1uu = 011010$ is not a valid beginning of a $\mu^2$-block; 
finally, $z_1 = 0101$ must be followed by 0, not by 1. Thus, $|u| \equiv 0 \pmod 3$. Let us analyze 
the possible values of $z_1$. 

Case 3.1: $z_1 = 0101$, $v = 0v'0101$, $w = uu\,0v'0101\,0v'0101\,uu$. Since u cannot end 
with 0101, w has exactly two occurrences of 01010 (u[1] = 0). Let us put $v' = v''0$, 
$\tilde v = 0v''$. Deleting both occurrences of 01010, we obtain the forbidden word $uu\tilde v\tilde vuu$ 
which is shorter than w, a contradiction. 

Case 3.2: $z_1 = 010$, $v = 10v'010$, $w = uu\,10v'010\,10v'010\,uu$. If there is no factor 01010 
on the left border of vv, then uu ends in m in the position equal to 7 modulo 9. If this 
factor appears there, then uu ends in m in the position equal to 3 modulo 9. Similarly, 
if there is the factor [resp., no factor] 01010 on the right border of vv, then uu begins in 
the position equal to 8 modulo 9 [resp., to 4 modulo 9]. Since $|u| \equiv 0 \pmod 3$, exactly 
one factor 01010 should occur at the borders of vv. If this factor is on the left, we 
put $v' = 010v''$. Then deleting both factors 01010 we obtain a shorter forbidden factor 
$uu\,v''010\,v''010\,uu$ to get a contradiction (observe that the deleted suffix 010 of the second 
u is replaced by the prefix 010 of v'). Similarly, if the factor is on the right, we put 
$v' = v''10$ to obtain, after the deletion, a shorter forbidden factor $uu\,10v''\,10v''\,uu$. 

For Case 3.3: $z_1 = 01$ and Case 3.4: $z_1 = 0$, the same analysis as in Case 3.2 works. 



Case 4: neither uu nor vv contains 01010 in the middle. Then both uu and vv are 
factors of m. We obtain contradictions between the length of uu and its starting and 
ending positions in m. 

Case 4.1: 01010 was inserted at the left border of vv. Since 01010 is followed by 
010011, we see that either v = 010 or $|v| \equiv 0 \pmod 9$ by Lemma 3.1(4). In the first case, 
the starting position of uu equals 4 modulo 9, and its ending position equals 2 modulo 9, 
contradicting Lemma 3.1(4). If $|v| \equiv 0 \pmod 9$, let the starting position of vv be equal 
to k modulo 9. Then the ending position of uu equals k+4 modulo 9, while its starting 
position equals either k−5 or k modulo 9, depending on the existence of the factor 01010 
at the right border of vv. In both cases, we have a contradiction with Lemma 3.1(4). 

Case 4.2: 01010 was inserted only at the right border of vv. Similar to Case 4.1, we 
analyze the possible lengths of v (|v| = 1, |v| = 6, and $|v| \equiv 0 \pmod 9$), obtaining that 
the length of u cannot satisfy Lemma 3.1(4). 

Thus, we finished the case study, obtaining contradictions in all cases. Hence, the 
forbidden word w does not exist, and the set S avoids {xxx, xxyyxx}. Finally, we estimate 
the total number of factors in all words $z \in S$, similar to the proof of Theorem 4.1. 
The word m has $\Theta(n)$ factors of length n; this follows, e.g., from Pansiot's classification 
theorem (see [10]). It is clear that such a factor contains $n/2 + O(1)$ zeroes and then, 
$n/18 + O(1)$ factors $\mu^2(0)$. The latter quantity coincides with the number of places for 
insertions of the factor 01010. Thus, from the factors of m of length n we can construct 
$\Theta(n)\,2^{n/18+O(1)}$ factors of words from S. The lengths of these factors cover the interval of 
length $O(n)$. Therefore, the growth rate of the binary language avoiding {xxx, xxyyxx} is 
at least $2^{1/18}$, as required. □ 

4.2 Mapping ternary square-free words 

In this section, we explore another approach for getting lower bounds. Namely, the 
fact that the language of ternary square-free words has exponential growth leads to the 
following simple observation. 

Observation 4.1. If an n-uniform morphism $f : \{0,1,2\}^* \to \{0,1\}^*$ transforms any 
square-free ternary word to a binary word avoiding {xxx, P}, then the number of such 
binary words grows exponentially at rate at least $\alpha^{1/n}$, where $\alpha$ is the growth rate of the 
language of ternary square-free words. 

The morphisms with the desired properties can be obtained using the method 
described in [13]. The number $\alpha$ is known with a quite high precision: $1.3017597 < \alpha < 
1.3017619$ (cf. [23]). 

Theorem 4.4. The number of binary cube-free words avoiding the pattern P, where 
$P \in \{xxyxyy, xxyyxyx, xyxxyxy, xyxxyyxy\}$, grows exponentially with the rate of at least 

• $\alpha^{1/14} \approx 1.0190$ for P = xxyxyy, 

• $\alpha^{1/13} \approx 1.0205$ for P = xxyyxyx, xyxxyxy, 

• $\alpha^{1/10} \approx 1.0267$ for P = xyxxyyxy. 
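The numerical values in the theorem follow from Observation 4.1 by taking the n-th root of the lower bound on $\alpha$ quoted above. A minimal sketch of that arithmetic (the constant and function names are illustrative, not from the paper):

```python
# Lower bound on the growth rate of ternary square-free words, as quoted in the text.
ALPHA_LOWER = 1.3017597

def image_growth_lower_bound(n, alpha=ALPHA_LOWER):
    """Lower bound alpha**(1/n) for the language reached by an n-uniform morphism."""
    return alpha ** (1.0 / n)

# Morphism lengths n = 14, 13, 13, 10 are those of g_1, ..., g_4 in the proof.
for pattern, n in [("xxyxyy", 14), ("xxyyxyx", 13), ("xyxxyxy", 13), ("xyxxyyxy", 10)]:
    print(f"{pattern}: alpha^(1/{n}) >= {image_growth_lower_bound(n):.4f}")
```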

Proof. In the proof of Theorem 3.2, we used morphic preimages to reduce the proof of 
pattern avoidance to the exhaustive search of forbidden factors in short words. Since we 
cannot iterate morphisms acting on alphabets of different sizes, here we need a different 
argument for such a reduction. For this purpose, we construct binary words avoiding 
simultaneously cubes, the pattern P, and large squares. We use the notation $S_t$ for the 
t-ary pattern $(x_1 \cdots x_t)^2$. 

Consider the morphisms $g_1$, $g_2$, $g_3$, and $g_4$ such that 

$g_1(0) = 01011001100101$     $g_2(0) = 0100110011011$ 
$g_1(1) = 00110110010011$     $g_2(1) = 0100101101001$ 
$g_1(2) = 00101001101011$     $g_2(2) = 0011011001001$ 

$g_3(0) = 0010110110011$      $g_4(0) = 0101100110$ 
$g_3(1) = 0010110011011$      $g_4(1) = 0101001011$ 
$g_3(2) = 0010011010011$      $g_4(2) = 0100110010.$ 



For any square-free word $w \in \{0,1,2\}^*$ we claim that 

• the word $g_1(w)$ avoids $\{xxx, xxyxyy, S_8\}$; 

• the word $g_2(w)$ avoids $\{xxx, xxyyxyx, S_9\}$; 

• the word $g_3(w)$ avoids $\{xxx, xyxxyxy, S_{10}\}$; 

• the word $g_4(w)$ avoids $\{xxx, xyxxyyxy, S_8\}$. 

To prove this claim, we notice that for every binary pattern P considered in this 
section, both variables x and y are involved in a square. This implies that in a word 
containing only squares of bounded length, potential occurrences of P and of cubes have 
bounded length as well. So we can check exhaustively that $g_i(w)$ avoids cubes and P for 
all short square-free words w. Let a large square be an occurrence of $S_t$. It remains 
to prove that if w is square-free, then $g_i(w)$ does not contain large squares. The proof is 
the same for all four morphisms. 

Let $n_i = |g_i(a)|$, $a \in \{0,1,2\}$. First we check that the morphism $g_i$ is 
$2n_i$-synchronizing. Indeed, any factor of $g_i(w)$ of length $2n_i$ contains a $g_i$-image of some letter a; 
but it is easy to see that for any letters $a, b, c \in \{0,1,2\}$, the factor $g_i(a)$ only appears in 
$g_i(bc)$ as a prefix or as a suffix. Then we check that no large square appears in the $g_i$-image 
of a ternary square-free word of length 5. So, a potential large square uu in $g_i(w)$ is such 
that $|u| > 2n_i$ and thus $|u| = qn_i$ for some integer $q \ge 3$ by the synchronizing property. 
So uu is contained in the image of a word of the form w = avbvc with $a, b, c \in \Sigma_3$ and the 
center of uu lies in $g_i(b)$. Moreover, $a \ne b$ and $b \ne c$ since w is square-free. This implies 
that abc is square-free and that $g_i(abc)$ contains a square u'u' with $|u'| = n_i$. Now u'u' 
is a large square because $n_i > t$ for all our morphisms $g_i$. This is a contradiction since 
no large square appears in the $g_i$-image of a ternary square-free word of length 5. The 
claim, and then the theorem, is proved. □ 
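The finite checks used in this proof are easy to script. The sketch below is a hedged illustration, not the authors' program: it takes $g_4$ as listed above with the bound t = 8 from the claim, tests the prefix/suffix property behind synchronization, and scans the images of all ternary square-free words of length 5 for cubes and for squares of period at least t. It does not test avoidance of the pattern P itself, and all function names are hypothetical.

```python
from itertools import product

# g_4 as listed in the text (10-uniform); t = 8 is the bound claimed for it.
G4 = {"0": "0101100110", "1": "0101001011", "2": "0100110010"}

def apply_morphism(g, w):
    return "".join(g[c] for c in w)

def has_square(w, min_period=1):
    """True if w contains a factor uu with |u| >= min_period."""
    n = len(w)
    for p in range(min_period, n // 2 + 1):
        for i in range(n - 2 * p + 1):
            if w[i:i + p] == w[i + p:i + 2 * p]:
                return True
    return False

def has_cube(w):
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if w[i:i + p] == w[i + p:i + 2 * p] == w[i + 2 * p:i + 3 * p]:
                return True
    return False

def square_free_words(length, alphabet="012"):
    """All square-free words of the given length, built letter by letter."""
    words = [""]
    for _ in range(length):
        words = [w + a for w in words for a in alphabet if not has_square(w + a)]
    return words

def images_only_at_borders(g):
    """Check that g(a) occurs in g(bc) only as a prefix or as a suffix."""
    n = len(next(iter(g.values())))
    for a, b, c in product(g, repeat=3):
        pair = g[b] + g[c]
        if any(pair.startswith(g[a], i) for i in range(1, n)):
            return False
    return True

def check_morphism(g, t, length=5):
    """Run the two finite checks from the proof for squares of period >= t."""
    if not images_only_at_borders(g):
        return False
    for w in square_free_words(length):
        image = apply_morphism(g, w)
        if has_cube(image) or has_square(image, t):
            return False
    return True

if __name__ == "__main__":
    print("g_4 passes the finite checks:", check_morphism(G4, 8))
```

The same call can be made with the other three morphisms by supplying their images and the corresponding value of t.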

Proving Theorem 4.4, we actually showed that the considered binary patterns can 
be avoided by binary cube-free words simultaneously with large squares. So, a natural 
problem is to find the exact bound for the length of these large squares. The following 
theorem gives this bound for all patterns listed in (1). 

Theorem 4.5. Let $P \in \{xxyxyx, xxyxxy, xxyxyy, xxyyxx, xxyyxyx, xyxxyxy, xyxxyyxy\}$ and 
let t(P) be the number such that the set of patterns $\{xxx, P, S_{t(P)}\}$ is 2-avoidable while the 
set $\{xxx, P, S_{t(P)-1}\}$ is 2-unavoidable. Then 

$$t(P) = \begin{cases} 4 & \text{if } P = xxyyxx, \\ 5 & \text{if } P \in \{xxyxyy, xxyyxyx, xyxxyxy, xyxxyyxy\}, \\ 7 & \text{if } P \in \{xxyxxy, xxyxyx\}, \end{cases}$$

and the binary language avoiding $\{xxx, P, S_{t(P)}\}$ has exponential growth. 

Proof. Below we list the morphisms mapping ternary square-free words to the binary 
words avoiding the required sets. The proof of avoidability and exponential growth is the 
same as for Theorem 4.4. 

P = xxyyxx, length = 62 

0 -> 00100101101100101001101101001001101011001010011011001001101011 
1 -> 00100101101100101001101100100110101100101001101101001001101011 
2 -> 00100101101100101001101011001001101100101001101101001001101011 

P = xxyxyy, length = 88 

0 -> 00100110101100101001100110101100110010100110101100100110110010100 
     11001101011001010011011 
1 -> 00100110101100101001100110101100100110110010100110101100110010100 
     11001101011001010011011 
2 -> 00100110101100101001100110101100100110110010100110011010110010100 
     11010110011001010011011 

P = xyxxyxy, length = 49 

0 -> 0011001011011001101001001100110101100101001101011 
1 -> 0011001011011001101001001100101101100101001101011 
2 -> 0011001011011001001101011001010011011001001101011 

P = xxyyxyx, length = 32 

0 -> 00100110110100100110011011010011 
1 -> 00100101101001001101101001011011 
2 -> 00100101100110110100100110011011 

P = xyxxyyxy, length = 28 

0 -> 0010010110100110011010110011 
1 -> 0010010110100110010110110011 
2 -> 0010010110011011010010110011 

P = xxyxyx, length = 44 

0 -> 00100110011010011001011001101001011011001101 
1 -> 00100110010110110011001011001101001100101101 
2 -> 00100110010110011010010110110011001011001101 

P = xxyxxy, length = 66 

0 -> 001010011001011001101001100101101001101011001101001100101001101011 
1 -> 001010011001011001101001100101001101011001101001100101101001101011 
2 -> 001010011001011001101001011001010011010110011010011001011001101011 

Unavoidability of shorter squares is verified by computer search. □ 

5 Growth rates: numerical results 

A general method to obtain upper bounds for the growth rates of factorial languages was 
proposed in [22]. An open-source implementation of this method can be found in [1]. We 
adjust this method for each pattern under consideration and calculate the upper bounds 
for the growth rates of the avoiding binary cube-free languages. Here is a high-level overview 
of the method. 

Let L be a factorial language and M be its set of minimal forbidden words. If L is an 
infinite language avoiding a pattern, then M is also infinite. We construct a family $\{M_i\}$ 
of finite subsets of M such that 

$M_1 \subseteq M_2 \subseteq \cdots \subseteq M_i \subseteq \cdots \subseteq M, \qquad M_1 \cup M_2 \cup \cdots \cup M_i \cup \cdots = M.$ 

Let $L_i$ be the binary factorial language with the set of minimal forbidden words $M_i$. One 
has 

$L \subseteq \cdots \subseteq L_i \subseteq \cdots \subseteq L_1, \qquad L_1 \cap L_2 \cap \cdots \cap L_i \cap \cdots = L.$ 

It is not hard to show that the sequence of growth rates $\{Gr(L_i)\}$ decreases and converges 
to Gr(L). The languages $L_i$ are regular, and then the number $Gr(L_i)$ can be found with 
any degree of precision. Increasing i, one can make the upper bound arbitrarily close to 
Gr(L). 

Thus, to obtain an upper bound for Gr(L) one should make three steps. First, build 
a set of minimal forbidden words $M_i$ for the chosen i. Second, convert this set into a 
deterministic finite automaton recognizing $L_i$ (the automaton should be both accessible 
and coaccessible). And finally, calculate the number $Gr(L_i)$. If we calculate $M_i$ by some 
search procedure and store it in a trie, then the second step can be implemented as 
a modified Aho-Corasick algorithm for pattern matching that converts the trie into an 
automaton having the desired properties. At the third step we calculate the growth 
rate of $L_i$ with any prescribed precision by an efficient (linear in the size of automaton) 
iterative algorithm. The second and third steps are common for all factorial languages. 
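The second and third steps can be sketched in a few lines. The code below is a simplified stand-in, not the implementation from [1] or the algorithm of [22]: it builds an Aho-Corasick-style automaton from a finite set of forbidden words, discards the states that have read a forbidden word, and estimates the growth rate as $(c(2N)/c(N))^{1/N}$, where c(k) is the number of avoiding words of length k obtained by plain iteration of the transition counts. The function names and the choice N = 200 are assumptions made for the illustration.

```python
from collections import deque
from math import exp, log

def build_automaton(forbidden, alphabet="01"):
    """Aho-Corasick-style DFA. Returns (delta, bad): delta[s][ch] is the next
    state, bad[s] is True when the word read so far ends with a forbidden word."""
    goto, bad = [{}], [False]
    for word in forbidden:                     # build the trie
        s = 0
        for ch in word:
            if ch not in goto[s]:
                goto.append({})
                bad.append(False)
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        bad[s] = True
    fail = [0] * len(goto)
    delta = [dict() for _ in goto]
    queue = deque()
    for ch in alphabet:                        # transitions from the root
        if ch in goto[0]:
            delta[0][ch] = goto[0][ch]
            queue.append(goto[0][ch])
        else:
            delta[0][ch] = 0
    while queue:                               # breadth-first completion
        s = queue.popleft()
        bad[s] = bad[s] or bad[fail[s]]
        for ch in alphabet:
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = delta[fail[s]][ch]
                delta[s][ch] = t
                queue.append(t)
            else:
                delta[s][ch] = delta[fail[s]][ch]
    return delta, bad

def growth_rate(forbidden, alphabet="01", n=200):
    """Estimate (c(2n)/c(n))**(1/n), c(k) = number of avoiding words of length k."""
    delta, bad = build_automaton(forbidden, alphabet)
    good = [s for s in range(len(delta)) if not bad[s]]
    vec = {s: 0.0 for s in good}
    vec[0] = 1.0                               # start with the empty word
    log_count, log_at_n = 0.0, 0.0
    for step in range(1, 2 * n + 1):
        new = {s: 0.0 for s in good}
        for s in good:
            if vec[s]:
                for ch in alphabet:
                    t = delta[s][ch]
                    if not bad[t]:
                        new[t] += vec[s]
        total = sum(new.values())
        if total == 0.0:                       # finite language
            return 0.0
        log_count += log(total)                # log_count == log c(step)
        vec = {s: v / total for s, v in new.items()}
        if step == n:
            log_at_n = log_count
    return exp((log_count - log_at_n) / n)

if __name__ == "__main__":
    print(growth_rate(["000", "111", "010101", "101010"]))
```

Forbidding only 11 gives the Fibonacci language, so the estimate should approach the golden ratio; adding more minimal forbidden words can only lower the value, in line with the decreasing sequence $Gr(L_i)$ described above.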

For each pattern we use an ad-hoc procedure for constructing the set of minimal 
forbidden words for avoiding languages. In most cases we bound the length of the 
constructed forbidden words with some constant. We iterate over the candidate forbidden 
words in the order of increasing length and check that they do not contain proper 
forbidden factors, using already built shorter forbidden words for pruning. In practice, the 
described method allows us to construct and handle sets of thousands of forbidden words 
and automata of millions of vertices efficiently. Some numerical results are presented in 
Table 2. For each of the processed languages, the sequence of obtained upper bounds 
converges very fast. So, the actual value of the growth rate in each case is likely to be 
quite close to the given upper bound. 
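The search for minimal forbidden words of bounded length can be illustrated as follows. This is a generic sketch rather than the ad-hoc procedures used for the individual patterns: a word is reported as minimal forbidden when it violates the constraint while both its longest proper prefix and its longest proper suffix satisfy it. The predicate `avoids` and the function names are hypothetical; the example uses cube-freeness only.

```python
def has_cube(w):
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if w[i:i + p] == w[i + p:i + 2 * p] == w[i + 2 * p:i + 3 * p]:
                return True
    return False

def minimal_forbidden_words(avoids, max_len, alphabet="01"):
    """Minimal forbidden words of length <= max_len for the factorial language
    {w : avoids(w)}, found by extending avoiding words in order of length."""
    level = [""]                  # all avoiding words of the current length
    found = []
    for _ in range(max_len):
        shorter = set(level)
        next_level = []
        for u in level:           # u is the prefix, already known to avoid
            for a in alphabet:
                w = u + a
                if w[1:] not in shorter:
                    continue      # the suffix is forbidden: w is not minimal
                if avoids(w):
                    next_level.append(w)
                else:
                    found.append(w)
        level = next_level
    return found

if __name__ == "__main__":
    print(minimal_forbidden_words(lambda w: not has_cube(w), 6))
```

For cube-freeness with length bound 6 the search returns the four words 000, 111, 010101, 101010, which can then be passed to an automaton construction.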

Table 2: Growth rates of binary cube-free languages avoiding binary patterns: upper 
bounds 

Pattern     Upper bound     Pattern      Upper bound 
xxyxxy      1.098891        xyxyx        1 (previously known) 
xxyxyy      1.226850        xxyyxyx      1.310975 
xyxyxx      1.138449        xyxxyxy      1.281612 
xxyyxx      1.322304        xyxxyyxy     1.348932 


6 A language of polynomial growth 

Statement 3 of Theorem 1.1, proved in Sect. 4, tells us that xyxyx is the only binary 
pattern that is avoided by a subexponentially-growing infinite set of binary cube-free 
words. In this section, we present two binary patterns $P_1$ and $P_2$ such that the binary 
language avoiding $\{xxx, P_1, P_2\}$ has polynomial growth. This language contains the binary 
overlap-free language and is incomparable with the binary (7/3)-free language (the latter 
one is the biggest binary $\beta$-free language of polynomial growth [12]). Thus, this is an 
essentially new example of a language of polynomial growth. 

Theorem 6.1. The binary cube-free language avoiding both the patterns xyxyxx and 
xxyxyx has polynomial growth. 

Proof. Let L be the language of all binary cube-free words avoiding both xyxyxx and 
xxyxyx. Obviously, both L and its extendable part e(L) contain the set of all Thue-Morse 
factors. We aim to prove that this set coincides with e(L). The definition of extendable 
word implies that any word from e(L) is a factor of a Z-word all finite factors of which 
also belong to e(L). 

For any word from L, the factors 

000, 010101, 010100, 11001001, 10010011, 010010010, 

and their negations are forbidden. Hence, a word $w \in e(L)$ has no factor 01010, because 
any extension of it to the right contains 010101 or 010100. Similarly, w has no factor 00100: 
extending this word, we inevitably meet one of the words 000, 11001001, 10010011, or 
$(010)^3$. The same argument applies for 10101 and 11011. 
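The list of forbidden factors can be verified mechanically. The following sketch uses a small backtracking matcher (the function `meets` and the other names are illustrative) to confirm that each listed word, and its negation, meets one of the patterns xxx, xyxyxx, xxyxyx, and that a prefix of the Thue-Morse word meets none of them.

```python
def meets(word, pattern):
    """True if some factor of `word` is the image of `pattern` under a
    substitution of non-empty words for the variables."""
    def match(pos, k, assign):
        if k == len(pattern):
            return True
        var = pattern[k]
        if var in assign:                       # variable already has a value
            val = assign[var]
            return word.startswith(val, pos) and match(pos + len(val), k + 1, assign)
        for end in range(pos + 1, len(word) + 1):
            assign[var] = word[pos:end]         # try every non-empty value
            if match(end, k + 1, assign):
                return True
            del assign[var]
        return False
    return any(match(i, 0, {}) for i in range(len(word)))

def negate(word):
    return "".join("1" if c == "0" else "0" for c in word)

def thue_morse(n):
    """Prefix of length n of the Thue-Morse word (parity of binary digit sums)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))

PATTERNS = ["xxx", "xyxyxx", "xxyxyx"]
LISTED = ["000", "010101", "010100", "11001001", "10010011", "010010010"]

if __name__ == "__main__":
    for w in LISTED + [negate(w) for w in LISTED]:
        hits = [p for p in PATTERNS if meets(w, p)]
        print(w, hits)
```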

Claim. If a Z-word z has no factors 000, 01010, 00100, and their negations, then z is a 
product of $\theta$-blocks. 

If two squares of letters in a word begin in positions of different parity, then this word 
surely contains one of the listed factors. To see this, just consider the closest pair of such 
squares. So, all squares of letters in z occur in positions of the same parity. Hence, one 
can factorize z into the factors of length 2 in a way that splits any square of a letter, thus 
getting the desired product. 

Consider a Z-word z all factors of which belong to e(L). By the claim, z is a product 
of $\theta$-blocks. Consider its Thue-Morse preimage $z' = \theta^{-1}(z)$. The Z-word z' avoids the 
patterns xxx, xxyxyx, and xyxyxx. Indeed, if z' contains an image of a pattern under f, 
then z contains an image of the same pattern under $\theta f$. Hence, z' has no factors listed 
in the claim, and we conclude that it is a product of $\theta$-blocks. Then z is a product of 
$\theta^2$-blocks. Repeating this argument inductively, we obtain that z is a product of $\theta^n$-blocks 
for any n. Therefore, any finite factor of z is a factor of some $\theta^n$-block, i.e., a Thue-Morse 
factor, as desired. 
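The repeated desubstitution in this argument can be illustrated on finite words. The sketch assumes the standard Thue-Morse morphism $\theta$: 0 → 01, 1 → 10 and, unlike the Z-word setting of the proof, fixes the block boundaries at even positions of a finite word; the function names are hypothetical.

```python
THETA = {"0": "01", "1": "10"}          # standard Thue-Morse morphism (assumed)
INVERSE = {"01": "0", "10": "1"}

def theta(word):
    return "".join(THETA[c] for c in word)

def theta_preimage(word):
    """Preimage of `word` if it is a product of theta-blocks 01 and 10
    aligned at even positions, otherwise None."""
    if len(word) % 2:
        return None
    letters = []
    for i in range(0, len(word), 2):
        block = word[i:i + 2]
        if block not in INVERSE:
            return None
        letters.append(INVERSE[block])
    return "".join(letters)

def desubstitute(word):
    """Apply theta_preimage as long as possible; return (depth, remaining word)."""
    depth = 0
    while len(word) > 1:
        pre = theta_preimage(word)
        if pre is None:
            break
        word, depth = pre, depth + 1
    return depth, word

if __name__ == "__main__":
    block = "0"
    for _ in range(5):
        block = theta(block)            # the theta^5-block of 0, length 32
    print(block, desubstitute(block))
```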

The set of Thue-Morse factors contains $\Theta(n)$ words of length n, and then has the 
growth rate 1. But the languages L and e(L) always have the same growth rate (see [21, 
Theorem 3.1]), so our language L grows subexponentially. To prove that this growth is 
polynomial, some additional work is needed. 

Let us take an overlap $w = 0v0v0 \in L$ with |v| > 2 and analyze how it can be 
extended within L. The words 0w, w0 are images of xxyxyx and xyxyxx, respectively, so 
$0w, w0 \notin L$. Note that v begins or ends with 1, because w has no factor 000. Assuming 
w.l.o.g. that v = 1v' and extending w to the right by one symbol, we get a longer 
overlap: w1 = 01v'01v'01. We see that w11 and w101 are images of xyxyxx. Assume that 
$w100 = 01v'01v'0100 \in L$. 

If the last letter of v' is 1, then v ends with 11, because $010100 \notin L$. Then v' cannot 
begin with 1, because the factor 11011 in the middle of w means that w contains a 
forbidden factor (compare to the beginning of the proof). But if v' = 0v'', we see that 
the word w100 = 010v''010v''0100 meets the pattern xyxyxx ($x \to 0$, $y \to v''01$). So, v 
ends with 0 and then with 10. Then the word w1001 ends with 1001001, guaranteeing 
that $w10011, w10010 \notin L$. Thus, we have proved the following property. 

(A) Suppose that $w = uvuvu \in L$, $|uv| \ge 4$, and $|u| \ge 1$. Then w can be extended 
within L by at most three letters to each side. 

Finally, we estimate the number of words in L that are not (7/3)-free. These words 
contain overlaps with $|u| \ge |v|/2$. From (A) it follows that the set of words in L containing 
overlaps such that |u| > 1 and $|u| \ge |v|/2$ is finite. So, it remains to consider the case 
|u| = 1 (and then $|v| \le 2$). If |uv| = 2, the overlap is 01010 or 10101. It cannot be 
extended within L. Now let |uv| = 3. Such an overlap must contain the factor 00100 
or 11011, which cannot be extended within L to both sides simultaneously by more 
than one letter. Then the words from L containing an overlap of period 3 have the 
form 0010010z or 10010010z up to reversal and negation. The number of such words 
grows polynomially because the word z is overlap-free. Since the number of (7/3)-free 
words is also polynomial, we get a polynomial upper bound on the number of words in 
L. □ 

Remark 6.1. Concerning the bounds for the degree of the polynomial growth of the 
language L considered in Theorem 6.1, we have shown, in fact, that one can take the 
upper bound derived for the (7/3)-free language. The obvious lower bound stems 
from the fact that L contains all overlap-free words. 